Target-Side Generation of Prepositions for SMT

نویسندگان

  • Marion Weller
  • Alexander M. Fraser
  • Sabine Schulte im Walde
چکیده

We present a translation system that models the selection of prepositions in a targetside generation component. This novel approach allows the modeling of all subcategorized elements of a verb as either NPs or PPs according to target-side requirements relying on source and target side features. The BLEU scores are encouraging, but fail to surpass the baseline. We additionally evaluate the preposition accuracy for a carefully selected subset and discuss how typical problems of translating prepositions can be modeled with our method.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Modeling Complement Types in Phrase-Based SMT

We explore two approaches to model complement types (NPs and PPs) in an Englishto-German SMT system: A simple abstract representation inserts pseudo-prepositions that mark the beginning of noun phrases, to improve the symmetry of source and target complement types, and to provide a flat structural information on phrase boundaries. An extension of this representation generates context-aware synt...

متن کامل

Handling Multiword Expressions in Phrase-Based Statistical Machine Translation

Preprocessing of the parallel corpus plays an important role in improving the performance of a phrase-based statistical machine translation (PB-SMT). In this paper, we propose a frame work in which predefined information of Multiword Expressions (MWEs) can boost the performance of PB-SMT. We preprocess the parallel corpus to identify Noun-noun MWEs, reduplicated phrases, complex predicates and ...

متن کامل

Syntactic SMT Using a Discriminative Text Generation Model

We study a novel architecture for syntactic SMT. In contrast to the dominant approach in the literature, the system does not rely on translation rules, but treat translation as an unconstrained target sentence generation task, using soft features to capture lexical and syntactic correspondences between the source and target languages. Target syntax features and bilingual translation features ar...

متن کامل

Enriching SMT Training Data via Paraphrasing

This paper proposes a novel method to resolve the coverage problem of SMT system. The method generates paraphrases for source-side sentences of the bilingual parallel data, which are then paired with the target-side sentences to generate new parallel data. Within a statistical paraphrase generation framework, we employ an object function, named Sentence Novelty, to select paraphrases which havi...

متن کامل

Predicting Prepositions for SMT

Representation and Prediction Features Initial experiments showed that replacing prepositions by simple place-holders decreases the translation quality. As an extension to the basic approach with plain place-holders, we thus experiment with enriching the place-holders such that they contain more relevant information and represent the content of a preposition while still being in an abstract for...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2015